3574 results found.
Multimodal/Multimedia
Corpus
Language Type:
Bilingual
Languages:
English Spanish
Availability:
Freely Available
License:
Open Data Commons Attribution License (ODC-BY) v1.0
Size:
5000000 words
Production Status:
Newly created-finished
Use:
Machine Translation, SpeechToSpeech Translation
- Paper title: EMPAC: an English–Spanish Corpus of Institutional Subtitles
- Paper track: Multimodality/oral presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | José Manuel Martínez Martínez | EuroparlTV Multimedia Parallel Corpus (EMPAC) | /N |
Documentation:
English documentation will be released together with the corpus and its toolkit.
Written
Corpus
Language Type:
Bilingual
Languages:
Brazilian Portuguese English
Availability:
Freely Available
License:
Creative Commons 2.0
Size:
310 KByte
Production Status:
Newly created-finished
Use:
Machine Learning
- Paper title: NMT and PBSMT Error Analyses in English to Brazilian Portuguese Automatic Translations
- Paper track: Evaluation/poster presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Helena Caseli | FAPESP-PBSMT-NMT | /N |
Documentation:
Documentation available (under construction) in English
Written
Corpus
Language Type:
Bilingual
Languages:
Brazilian Portuguese English
Availability:
Freely Available
License:
Creative Commons 2.0
Size:
None
Production Status:
Existing-used
Use:
Machine Translation, SpeechToSpeech Translation
- Paper title: NMT and PBSMT Error Analyses in English to Brazilian Portuguese Automatic Translations
- Paper track: Evaluation/poster presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Helena Caseli | Revista Pesquisa FAPESP Parallel Corpora | /N |
Documentation:
Corpus construction is described in the paper (Aziz and Specia, 2011) from STIL 2011.
Speech
Corpus
Language Type:
Monolingual
Languages:
English
Availability:
Freely Available
License:
OpenNeuro
Size:
75 participants (Other)
Production Status:
Newly created-finished
Use:
Corpus Creation/Annotation
- Paper title: The Alice Datasets: fMRI & EEG Observations of Natural Language Comprehension
- Paper track: Speech/oral presentation
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Shohini Bhattasali | The Alice Datasets: fMRI & EEG Observations of Natural Language Comprehension | /N |
Documentation:
None
Modality Independent
Lexicon
Language Type:
Monolingual
Languages:
English
Availability:
Freely Available
License:
Size:
40,481 words
Production Status:
Existing-used
Use:
Behavioral data
- Paper title: The Alice Datasets: fMRI & EEG Observations of Natural Language Comprehension
- Paper track: Speech/oral presentation
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Shohini Bhattasali | The English Lexicon Project | /N |
Documentation:
Balota, David A., Melvin J. Yap, Keith A. Hutchison, Michael J. Cortese, Brett Kessler, Bjorn Loftis, James H. Neely, Douglas L. Nelson, Greg B. Simpson, and Rebecca Treiman. "The English lexicon project." Behavior research methods 39, no. 3 (2007): 445-459.
Modality Independent
Tagger/Parser
Language Type:
Monolingual
Languages:
English
Availability:
Freely Available
License:
Size:
None
Production Status:
Existing-used
Use:
Language Modelling
- Paper title: The Alice Datasets: fMRI & EEG Observations of Natural Language Comprehension
- Paper track: Speech/oral presentation
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Shohini Bhattasali | Accurate Unlexicalized Parsing | /N |
Documentation:
https://nlp.stanford.edu/software/lex-parser.shtml
Written
Evaluation Data
Language Type:
Bilingual
Languages:
English Japanese
Availability:
Not Available
License:
Size:
724 entries
Production Status:
Newly created-finished
Use:
Evaluation/Validation
- Paper title: Evaluation Dataset for Zero Pronoun in Japanese to English Translation
- Paper track: Evaluation/poster presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Sho Shimazu | Dataset for Zero Pronoun in Contextualized Japanese to English Translation | /N |
Documentation:
Documentation written in English is attached to the constructed dataset.
Written
Corpus
Language Type:
Monolingual
Languages:
English
Availability:
From Owner
License:
Size:
1131 sentences
Production Status:
Newly created-in progress
Use:
Corpus Creation/Annotation
- Paper title: A Dataset for Investigating the Impact of Feedback on Student Revision Outcome
- Paper track: Written/poster presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Ildikó Pilán | Teacher Feedback - Student Revision Outcome | /N |
Documentation:
None
Written
Treebank
Language Type:
Monolingual
Languages:
English
Availability:
Freely Available
License:
CreativeCommons Attribution 4.0 International
Size:
15000 sentences
Production Status:
Newly created-finished
Use:
Sensitive Information Detection
- Paper title: A Real-World Data Resource of Complex Sensitive Sentences Based on Documents from the Monsanto Trial
- Paper track: Written/oral presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Jan Neerbek | Monsanto Trial Document | /N |
Documentation:
README file, in English.
Speech/Written
Corpus
Language Type:
Bilingual
Languages:
English German
Availability:
Freely Available
License:
CreativeCommons
Size:
62 GByte
Production Status:
Newly created-finished
Use:
Machine Translation, SpeechToSpeech Translation
- Paper title: LibriVoxDeEn: A Corpus for German-to-English Speech Translation and German Speech Recognition
- Paper track: Speech/poster presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Stefan Riezler | LibriVoxDeEN | /N |
Documentation:
English documentation




